Performance Comparison of Six Algorithms for Page Segmentation
Internal identifier: 001029 (Main/Exploration); previous: 001028; next: 001030
Authors: Faisal Shafait [Germany, Niger]; Daniel Keysers [Germany, Niger]; M. Breuel [Germany, Niger]
Source:
- Lecture Notes in Computer Science [0302-9743]; 2006.
Abstract
This paper presents a quantitative comparison of six algorithms for page segmentation: X-Y cut, smearing, whitespace analysis, constrained text-line finding, Docstrum, and Voronoi-diagram-based. The evaluation is performed using a subset of the UW-III collection commonly used for evaluation, with a separate training set for parameter optimization. We compare the results using both default parameters and optimized parameters. In the course of the evaluation, the strengths and weaknesses of each algorithm are analyzed, and it is shown that no single algorithm outperforms all other algorithms. However, we observe that the three best-performing algorithms are those based on constrained text-line finding, Docstrum, and the Voronoi diagram.
DOI: 10.1007/11669487_33
Links to previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 001621
- to stream Istex, to step Curation: 001528
- to stream Istex, to step Checkpoint: 000A02
- to stream Main, to step Merge: 001046
- to stream Main, to step Curation: 001029
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Performance Comparison of Six Algorithms for Page Segmentation</title>
<author><name sortKey="Shafait, Faisal" sort="Shafait, Faisal" uniqKey="Shafait F" first="Faisal" last="Shafait">Faisal Shafait</name>
</author>
<author><name sortKey="Keysers, Daniel" sort="Keysers, Daniel" uniqKey="Keysers D" first="Daniel" last="Keysers">Daniel Keysers</name>
</author>
<author><name sortKey="Breuel, M" sort="Breuel, M" uniqKey="Breuel M" first="M." last="Breuel">M. Breuel</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:73191E813FC49126BDFEF87AA23D9270B36FD3A7</idno>
<date when="2006" year="2006">2006</date>
<idno type="doi">10.1007/11669487_33</idno>
<idno type="url">https://api.istex.fr/document/73191E813FC49126BDFEF87AA23D9270B36FD3A7/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001621</idno>
<idno type="wicri:Area/Istex/Curation">001528</idno>
<idno type="wicri:Area/Istex/Checkpoint">000A02</idno>
<idno type="wicri:doubleKey">0302-9743:2006:Shafait F:performance:comparison:of</idno>
<idno type="wicri:Area/Main/Merge">001046</idno>
<idno type="wicri:Area/Main/Curation">001029</idno>
<idno type="wicri:Area/Main/Exploration">001029</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Performance Comparison of Six Algorithms for Page Segmentation</title>
<author><name sortKey="Shafait, Faisal" sort="Shafait, Faisal" uniqKey="Shafait F" first="Faisal" last="Shafait">Faisal Shafait</name>
<affiliation wicri:level="4"><country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Image Understanding and Pattern Recognition (IUPR) research group, German Research Center for Artificial Intelligence (DFKI), and Technical University of Kaiserslautern, D-67663, Kaiserslautern</wicri:regionArea>
<placeName><region type="land" nuts="2">Rhénanie-Palatinat</region>
<settlement type="city">Kaiserslautern</settlement>
</placeName>
<orgName type="university">Université de technologie de Kaiserslautern</orgName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Niger</country>
</affiliation>
</author>
<author><name sortKey="Keysers, Daniel" sort="Keysers, Daniel" uniqKey="Keysers D" first="Daniel" last="Keysers">Daniel Keysers</name>
<affiliation wicri:level="4"><country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Image Understanding and Pattern Recognition (IUPR) research group, German Research Center for Artificial Intelligence (DFKI), and Technical University of Kaiserslautern, D-67663, Kaiserslautern</wicri:regionArea>
<placeName><region type="land" nuts="2">Rhénanie-Palatinat</region>
<settlement type="city">Kaiserslautern</settlement>
</placeName>
<orgName type="university">Université de technologie de Kaiserslautern</orgName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Niger</country>
</affiliation>
</author>
<author><name sortKey="Breuel, M" sort="Breuel, M" uniqKey="Breuel M" first="M." last="Breuel">M. Breuel</name>
<affiliation wicri:level="4"><country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Image Understanding and Pattern Recognition (IUPR) research group, German Research Center for Artificial Intelligence (DFKI), and Technical University of Kaiserslautern, D-67663, Kaiserslautern</wicri:regionArea>
<placeName><region type="land" nuts="2">Rhénanie-Palatinat</region>
<settlement type="city">Kaiserslautern</settlement>
</placeName>
<orgName type="university">Université de technologie de Kaiserslautern</orgName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Niger</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2006</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">73191E813FC49126BDFEF87AA23D9270B36FD3A7</idno>
<idno type="DOI">10.1007/11669487_33</idno>
<idno type="ChapterID">33</idno>
<idno type="ChapterID">Chap33</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: This paper presents a quantitative comparison of six algorithms for page segmentation: X-Y cut, smearing, whitespace analysis, constrained text-line finding, Docstrum, and Voronoi-diagram-based. The evaluation is performed using a subset of the UW-III collection commonly used for evaluation, with a separate training set for parameter optimization. We compare the results using both default parameters and optimized parameters. In the course of the evaluation, the strengths and weaknesses of each algorithm are analyzed, and it is shown that no single algorithm outperforms all other algorithms. However, we observe that the three best-performing algorithms are those based on constrained text-line finding, Docstrum, and the Voronoi-diagram.</div>
</front>
</TEI>
<affiliations><list><country><li>Allemagne</li>
<li>Niger</li>
</country>
<region><li>Rhénanie-Palatinat</li>
</region>
<settlement><li>Kaiserslautern</li>
</settlement>
<orgName><li>Université de technologie de Kaiserslautern</li>
</orgName>
</list>
<tree><country name="Allemagne"><region name="Rhénanie-Palatinat"><name sortKey="Shafait, Faisal" sort="Shafait, Faisal" uniqKey="Shafait F" first="Faisal" last="Shafait">Faisal Shafait</name>
</region>
<name sortKey="Breuel, M" sort="Breuel, M" uniqKey="Breuel M" first="M." last="Breuel">M. Breuel</name>
<name sortKey="Keysers, Daniel" sort="Keysers, Daniel" uniqKey="Keysers D" first="Daniel" last="Keysers">Daniel Keysers</name>
</country>
<country name="Niger"><noRegion><name sortKey="Shafait, Faisal" sort="Shafait, Faisal" uniqKey="Shafait F" first="Faisal" last="Shafait">Faisal Shafait</name>
</noRegion>
<name sortKey="Breuel, M" sort="Breuel, M" uniqKey="Breuel M" first="M." last="Breuel">M. Breuel</name>
<name sortKey="Keysers, Daniel" sort="Keysers, Daniel" uniqKey="Keysers D" first="Daniel" last="Keysers">Daniel Keysers</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001029 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001029 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= ISTEX:73191E813FC49126BDFEF87AA23D9270B36FD3A7 |texte= Performance Comparison of Six Algorithms for Page Segmentation }}
This area was generated with Dilib version V0.6.32.